AITopics | Marion County

Collaborating Authors

Marion County

Scalable Unit Harmonization in Medical Informatics via Bayesian-Optimized Retrieval and Transformer-Based Re-ranking

arXiv.org Artificial IntelligenceNov-18-2025

Objective: To develop and evaluate a scalable methodology for harmonizing inconsistent units in large-scale clinical datasets, addressing a key barrier to data interoperability. Materials and Methods: We designed a novel unit harmonization system combining BM25, sentence embeddings, Bayesian optimization, and a bidirectional transformer based binary classifier for retrieving and matching laboratory test entries. The system was evaluated using the Optum Clinformatics Datamart dataset (7.5 billion entries). We implemented a multi-stage pipeline: filtering, identification, harmonization proposal generation, automated re-ranking, and manual validation. Performance was assessed using Mean Reciprocal Rank (MRR) and other standard information retrieval metrics. Results: Our hybrid retrieval approach combining BM25 and sentence embeddings (MRR: 0.8833) significantly outperformed both lexical-only (MRR: 0.7985) and embedding-only (MRR: 0.5277) approaches. The transformer-based reranker further improved performance (absolute MRR improvement: 0.10), bringing the final system MRR to 0.9833. The system achieved 83.39\% precision at rank 1 and 94.66\% recall at rank 5. Discussion: The hybrid architecture effectively leverages the complementary strengths of lexical and semantic approaches. The reranker addresses cases where initial retrieval components make errors due to complex semantic relationships in medical terminology. Conclusion: Our framework provides an efficient, scalable solution for unit harmonization in clinical datasets, reducing manual effort while improving accuracy. Once harmonized, data can be reused seamlessly in different analyses, ensuring consistency across healthcare systems and enabling more reliable multi-institutional studies and meta-analyses.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.ijmedinf.2025.106180

2505.0081

Country:

North America > United States > Mississippi > Marion County (0.04)
North America > United States > Minnesota > Hennepin County > Eden Prairie (0.04)
Indian Ocean > Red Sea (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Add feedback

Knots: A Large-Scale Multi-Agent Enhanced Expert-Annotated Dataset and LLM Prompt Optimization for NOTAM Semantic Parsing

Liu, Maoqi, Fang, Quan, Yang, Yang, Zhao, Can, Cai, Kaiquan

arXiv.org Artificial IntelligenceNov-18-2025

Notice to Air Missions (NOTAMs) serve as a critical channel for disseminating key flight safety information, yet their complex linguistic structures and implicit reasoning pose significant challenges for automated parsing. Existing research mainly focuses on surface-level tasks such as classification and named entity recognition, lacking deep semantic understanding. To address this gap, we propose NOTAM semantic parsing, a task emphasizing semantic inference and the integration of aviation domain knowledge to produce structured, inference-rich outputs. To support this task, we construct Knots (Knowledge and NOTAM Semantics), a high-quality dataset of 12,347 expert-annotated NOTAMs covering 194 Flight Information Regions, enhanced through a multi-agent collaborative framework for comprehensive field discovery. We systematically evaluate a wide range of prompt-engineering strategies and model-adaptation techniques, achieving substantial improvements in aviation text understanding and processing. Our experimental results demonstrate the effectiveness of the proposed approach and offer valuable insights for automated NOTAM analysis systems. Our code is available at: https://github.com/Estrellajer/Knots.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.1263

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > Mississippi > Marion County (0.04)
North America > United States > Louisiana (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Air (1.00)
Transportation > Infrastructure & Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

CountingFruit: Language-Guided 3D Fruit Counting with Semantic Gaussian Splatting

Li, Fengze, Liu, Yangle, Ma, Jieming, Liang, Hai-Ning, Shen, Yaochun, Li, Huangxiang, Wu, Zhijing

arXiv.org Artificial IntelligenceAug-8-2025

Accurate 3D fruit counting in orchards is challenging due to heavy occlusion, semantic ambiguity between fruits and surrounding structures, and the high computational cost of volumetric reconstruction. Existing pipelines often rely on multi-view 2D segmentation and dense volumetric sampling, which lead to accumulated fusion errors and slow inference. We introduce FruitLangGS, a language-guided 3D fruit counting framework that reconstructs orchard-scale scenes using an adaptive-density Gaussian Splatting pipeline with radius-aware pruning and tile-based rasterization, enabling scalable 3D representation. During inference, compressed CLIP-aligned semantic vectors embedded in each Gaussian are filtered via a dual-threshold cosine similarity mechanism, retrieving Gaussians relevant to target prompts while suppressing common distractors (e.g., foliage), without requiring retraining or image-space masks. The selected Gaussians are then sampled into dense point clouds and clustered geometrically to estimate fruit instances, remaining robust under severe occlusion and viewpoint variation. Experiments on nine different orchard-scale datasets demonstrate that FruitLangGS consistently outperforms existing pipelines in instance counting recall, avoiding multi-view segmentation fusion errors and achieving up to 99.7% recall on Pfuji-Size_Orch2018 orchard dataset. Ablation studies further confirm that language-conditioned semantic embedding and dual-threshold prompt filtering are essential for suppressing distractors and improving counting accuracy under heavy occlusion. Beyond fruit counting, the same framework enables prompt-driven 3D semantic retrieval without retraining, highlighting the potential of language-guided 3D perception for scalable agricultural scene understanding.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.01109

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > China > Guangdong Province > Guangzhou (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Food & Agriculture > Agriculture (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

A Comprehensive Survey on Deep Learning Solutions for 3D Flood Mapping

Jia, Wenfeng, Liang, Bin, Liu, Yuxi, Khan, Muhammad Arif, Zheng, Lihong

arXiv.org Artificial IntelligenceJul-14-2025

Flooding remains a major global challenge, worsened by climate change and urbanization, demanding advanced solutions for effective disaster management. While traditional 2D flood mapping techniques provide limited insights, 3D flood mapping, powered by deep learning (DL), offers enhanced capabilities by integrating flood extent and depth. This paper presents a comprehensive survey of deep learning-based 3D flood mapping, emphasizing its advancements over 2D maps by integrating flood extent and depth for effective disaster management and urban planning. The survey categorizes deep learning techniques into task decomposition and end-to-end approaches, applicable to both static and dynamic flood features. We compare key DL architectures, highlighting their respective roles in enhancing prediction accuracy and computational efficiency. Additionally, this work explores diverse data sources such as digital elevation models, satellite imagery, rainfall, and simulated data, outlining their roles in 3D flood mapping. The applications reviewed range from real-time flood prediction to long-term urban planning and risk assessment. However, significant challenges persist, including data scarcity, model interpretability, and integration with traditional hydrodynamic models. This survey concludes by suggesting future directions to address these limitations, focusing on enhanced datasets, improved models, and policy implications for flood management. This survey aims to guide researchers and practitioners in leveraging DL techniques for more robust and reliable 3D flood mapping, fostering improved flood management strategies.

artificial intelligence, machine learning, survey article, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-981-96-8295-9_2

2506.13201

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Mississippi > Marion County (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
(4 more...)

Genre: Overview (1.00)

Industry:

Education (0.76)
Information Technology > Security & Privacy (0.49)
Law (0.48)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Medical VIE via Reinforcement Learning

Liu, Lijun, Li, Ruiyang, Liu, Zhaocheng, Zhu, Chenglin, Li, Chong, Cheng, Jiehan, Ju, Qiang, Xie, Jian

arXiv.org Artificial IntelligenceJun-17-2025

Visual Information Extraction (VIE) converts unstructured document images into structured formats like JSON, critical for medical applications such as report analysis and online consultations. Traditional methods rely on OCR and language models, while end-to-end multimodal models offer direct JSON generation. However, domain-specific schemas and high annotation costs limit their effectiveness in medical VIE. We base our approach on the Reinforcement Learning with Verifiable Rewards (RLVR) framework to address these challenges using only 100 annotated samples. Our approach ensures dataset diversity, a balanced precision-recall reward mechanism to reduce hallucinations and improve field coverage, and innovative sampling strategies to enhance reasoning capabilities. Fine-tuning Qwen2.5-VL-7B with our RLVR method, we achieve state-of-the-art performance on medical VIE tasks, significantly improving F1, precision, and recall. While our models excel on tasks similar to medical datasets, performance drops on dissimilar tasks, highlighting the need for domain-specific optimization. Case studies further demonstrate the value of reasoning during training and inference for VIE.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.13363

Country:

North America > United States > Mississippi > Marion County (0.24)
Asia > China (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Add feedback

Data-Driven Soil Organic Carbon Sampling: Integrating Spectral Clustering with Conditioned Latin Hypercube Optimization

Zhao, Weiying, Unagaev, Aleksei, Efremova, Natalia

arXiv.org Artificial IntelligenceJun-17-2025

Soil organic carbon (SOC) monitoring often relies on selecting representative field sampling locations based on environmental covariates. We propose a novel hybrid methodology that integrates spectral clustering - an unsupervised machine learning technique with conditioned Latin hypercube sampling (cLHS) to enhance the representativeness of SOC sampling. In our approach, spectral clustering partitions the study area into $K$ homogeneous zones using multivariate covariate data, and cLHS is then applied within each zone to select sampling locations that collectively capture the full diversity of environmental conditions. This hybrid spectral-cLHS method ensures that even minor but important environmental clusters are sampled, addressing a key limitation of vanilla cLHS which can overlook such areas. We demonstrate on a real SOC mapping dataset that spectral-cLHS provides more uniform coverage of covariate feature space and spatial heterogeneity than standard cLHS. This improved sampling design has the potential to yield more accurate SOC predictions by providing better-balanced training data for machine learning models.

artificial intelligence, feature space, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2506.10419

Country: North America > United States > Mississippi > Marion County (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback

An Automated Pipeline for Few-Shot Bird Call Classification: A Case Study with the Tooth-Billed Pigeon

Jana, Abhishek, Uili, Moeumu, Atherton, James, O'Brien, Mark, Wood, Joe, Brickson, Leandra

arXiv.org Artificial IntelligenceMay-5-2025

This paper presents an automated one-shot bird call classification pipeline designed for rare species absent from large publicly available classifiers like BirdNET and Perch. While these models excel at detecting common birds with abundant training data, they lack options for species with only 1-3 known recordings-a critical limitation for conservationists monitoring the last remaining individuals of endangered birds. To address this, we leverage the embedding space of large bird classification networks and develop a classifier using cosine similarity, combined with filtering and denoising preprocessing techniques, to optimize detection with minimal training data. We evaluate various embedding spaces using clustering metrics and validate our approach in both a simulated scenario with Xeno-Canto recordings and a real-world test on the critically endangered tooth-billed pigeon (Didunculus strigirostris), which has no existing classifiers and only three confirmed recordings. The final model achieved 1.0 recall and 0.95 accuracy in detecting tooth-billed pigeon calls, making it practical for use in the field. This open-source system provides a practical tool for conservationists seeking to detect and monitor rare species on the brink of extinction.

automated pipeline, data quality, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2504.16276

Country:

North America > United States > Kentucky (0.04)
Oceania > Samoa > Gagaifomauga > Safotu (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Government (0.46)
Law > Environmental Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Quality (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Intrinsic Gaussian Vector Fields on Manifolds

Robert-Nicoud, Daniel, Krause, Andreas, Borovitskiy, Viacheslav

arXiv.org Machine LearningOct-28-2023

Various applications ranging from robotics to climate science require modeling signals on non-Euclidean domains, such as the sphere. Gaussian process models on manifolds have recently been proposed for such tasks, in particular when uncertainty quantification is needed. In the manifold setting, vector-valued signals can behave very differently from scalar-valued ones, with much of the progress so far focused on modeling the latter. The former, however, are crucial for many applications, such as modeling wind speeds or force fields of unknown dynamical systems. In this paper, we propose novel Gaussian process models for vector-valued signals on manifolds that are intrinsically defined and account for the geometry of the space in consideration. We provide computational primitives needed to deploy the resulting Hodge-Mat\'ern Gaussian vector fields on the two-dimensional sphere and the hypertori. Further, we highlight two generalization directions: discrete two-dimensional meshes and "ideal" manifolds like hyperspheres, Lie groups, and homogeneous spaces. Finally, we show that our Gaussian vector fields constitute considerably more refined inductive biases than the extrinsic fields proposed before.

artificial intelligence, kernel, machine learning, (16 more...)

arXiv.org Machine Learning

2310.18824

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
South America > Chile (0.04)
North America > United States > Mississippi > Marion County (0.04)
(5 more...)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Neural Stream Functions

Wurster, Skylar Wolfgang, Guo, Hanqi, Peterka, Tom, Shen, Han-Wei

arXiv.org Artificial IntelligenceJul-16-2023

We present a neural network approach to compute stream functions, which are scalar functions with gradients orthogonal to a given vector field. As a result, isosurfaces of the stream function extract stream surfaces, which can be visualized to analyze flow features. Our approach takes a vector field as input and trains an implicit neural representation to learn a stream function for that vector field. The network learns to map input coordinates to a stream function value by minimizing the inner product of the gradient of the neural network's output and the vector field. Since stream function solutions may not be unique, we give optional constraints for the network to learn particular stream functions of interest. Specifically, we introduce regularizing loss functions that can optionally be used to generate stream function solutions whose stream surfaces follow the flow field's curvature, or that can learn a stream function that includes a stream surface passing through a seeding rake. We also discuss considerations for properly visualizing the trained implicit network and extracting artifact-free surfaces. We compare our results with other implicit solutions and present qualitative and quantitative results for several synthetic and simulated vector fields.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/PacificVis56936.2023.00022

2307.08142

Country:

Europe > United Kingdom > North Sea > Southern North Sea (0.04)
North America > United States > Ohio (0.04)
North America > United States > Oklahoma > Beaver County (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

AI Techniques in the Microservices Life-Cycle: A Survey

Moreschini, Sergio, Pour, Shahrzad, Lanese, Ivan, Balouek-Thomert, Daniel, Bogner, Justus, Li, Xiaozhou, Pecorelli, Fabiano, Soldani, Jacopo, Truyen, Eddy, Taibi, Davide

arXiv.org Artificial IntelligenceMay-25-2023

Microservices is a popular architectural style for the development of distributed software, with an emphasis on modularity, scalability, and flexibility. Indeed, in microservice systems, functionalities are provided by loosely coupled, small services, each focusing on a specific business capability. Building a system according to the microservices architectural style brings a number of challenges, mainly related to how the different microservices are deployed and coordinated and how they interact. In this paper, we provide a survey about how techniques in the area of Artificial Intelligence have been used to tackle these challenges.

data mining, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2305.16092

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > India (0.04)
North America > Canada (0.04)
(42 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Services (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(5 more...)

Add feedback